SYSTEM FOR INDEPENDENT POWERING
OF A COMPUTER SYSTEM 

RELATED APPLICATIONS 

The subject matter of U.S. Patent Application entitled 
"Method of Independent Powering of Diagnostic Processes 
on a Computer System," filed on Oct. 1, 1997, application
Ser. No. 08/942,320, and having attorney Docket No. 
MNFRAME.002A4 is related to this application. 

COPYRIGHT RIGHTS 
patent files or records, but otherwise reserves all copyright

BACKGROUND OF THE INVENTION

1. Field of the Invention 

The invention relates to fault tolerant computer systems. 
More specifically, the invention is directed to a system for 
providing remote access and control of server environmental 
management. 

2. Description of the Related Technology 
As enterprise-class servers become more powerful and
more capable, they are also becoming increasingly sophis- 
ticated and complex. For many companies, these changes 
lead to concerns over server reliability and manageability, 
particularly in light of the increasingly critical role of 
server-based applications. While in the past many systems 
administrators were comfortable with all of the various 
components that made up a standards-based network server,
today's generation of servers can appear as an 
incomprehensible, unmanageable black box. Without vis- 
ibility into the underlying behavior of the system, the 
administrator must "fly blind." Too often the only indicator
the network manager has of the relative health of a particu-
lar server is whether or not it is running.

It is well-acknowledged that there is a lack of reliability 
and availability of most standards-based servers. Server 
downtime, resulting either from hardware or software faults 
or from regular maintenance, continues to be a significant
problem. By one estimate, the cost of downtime in mission 
critical environments has risen to an annual total of $4.0 
billion for U.S. businesses, with the average downtime event 
resulting in a $140 thousand loss in the retail industry and a 
$450 thousand loss in the securities industry. It has been 
reported that companies lose as much as $250 thousand in 
employee productivity for every 1% of computer downtime. 
With emerging Internet, intranet and collaborative applica- 
tions taking on more essential business roles every day, the 
cost of network server downtime will continue to spiral
upward. 

While hardware fault tolerance is an important element of 
an overall high availability architecture, it is only one piece 
of the puzzle. Studies show that a significant percentage of 
network server downtime is caused by transient faults in the
I/O subsystem. These faults may be due, for example, to the 
device driver, the adapter card firmware, or hardware which 
does not properly handle concurrent errors, and often causes 
servers to crash or hang. The result is hours of downtime per 
failure, while a system administrator discovers the failure,
takes some action, and manually reboots the server. In many
cases, data volumes on hard disk drives become corrupt and 
must be repaired when the volume is mounted. A dismount-
and-mount cycle may result from the lack of "hot plugga-
bility" in current standards-based servers. Diagnosing inter-
mittent errors can be a frustrating and time-consuming
process. For a system to deliver consistently high 
availability, it must be resilient to these types of faults. 
Accurate and available information about such faults is 
central to diagnosing the underlying problems and taking
corrective action. 

Modern fault tolerant systems have the functionality to 
provide the ambient temperature of a storage device enclo- 
sure and the operational status of other components such as 
the cooling fans and power supply. However, a limitation of
these server systems is that they do not contain self- 
managing processes to correct malfunctions. Also, if a 
malfunction occurs in a typical server, it relies on the 
operating system software to report, record and manage 
recovery of the fault. However, many types of faults will 
prevent such software from carrying out these tasks. For 
example, a disk drive failure can prevent recording of the 
fault in a log file on that disk drive. If the system error caused
the system to power down, then the system administrator
would never know the source of the error.

Traditional systems are lacking in detail and sophistication
when notifying system administrators of system malfunctions.
System administrators are in need of a graphical user
interface for monitoring the health of a network of servers.
Administrators need a simple point-and-click interface to
evaluate the health of each server in the network. In addition,
existing fault tolerant servers rely upon operating system
maintained logs for error recording. These systems are not
capable of maintaining information when the operating
system is inoperable due to a system malfunction. Existing
systems do not have a system log for maintaining information
when the main computational processors are inoperable
or the operating system has crashed.

Another limitation of the typical fault tolerant system is
that the control logic for the diagnostic system is associated
with a particular processor. Thus, if the environmental
control processor malfunctioned, then all diagnostic activity
on the computer would cease. In traditional systems, if a
controller dedicated to the fan system failed, then all fan
activity could cease, resulting in overheating and ultimate
failure of the server. What is desired is a way to obtain
diagnostic information when the server OS is not operational
or even when main power to the server is down.

Existing fault tolerant systems also lack the power to
remotely control a particular server, such as powering up and
down, resetting, reading system status, displaying flight
recorder and so forth. Such control of the server is desired
even when the server power is down. For example, if the
operating system on the remote machine failed, then a
system administrator would have to physically go to the
remote machine to re-boot the malfunctioning machine
before any system information could be obtained or
diagnostics could be started.

Therefore, a need exists for improvements in server
management which will result in greater reliability and
dependability of operation. Server users are in need of a
management system by which the users can accurately
gauge the health of their system. Users need a high
availability system that must not only be resilient to faults, but
must allow for maintenance, modification, and growth
without downtime. System users must be able to replace
failed components, and add new functionality, such as new
network interfaces, disk interface cards and storage, without
impacting existing users. As system demands grow,
organizations must frequently expand, or scale, their computing
infrastructure, adding new processing power, memory,
storage and I/O capacity. With demand for 24-hour access to
critical, server-based information resources, planned system
downtime for system service or expansion has become
unacceptable.

SUMMARY OF THE INVENTION

The inventive remote access system provides system
administrators with new levels of client/server system
availability and management. It gives system administrators and
network managers a comprehensive view into the underlying
health of the server in real time, whether on-site or
off-site. In the event of a failure, the invention enables the
administrator to learn why the system failed, why the system
was unable to boot, and to control certain functions of the
server from a remote station.

One embodiment of the present invention is a system for
independent powering of a first computer, comprising a first
computer storing status information; a first computer power
supply that supplies power to the first computer; a remote
interface power supply that is independent from the first
computer power supply; and a remote interface circuit that
receives power from the remote interface power supply and
is capable of providing independent power to portions of the
first computer to facilitate reading of the status information.

Another embodiment of the present invention is a system
for independent powering of a first computer, comprising a
first computer storing status information; a first computer
power supply that supplies power to the first computer; a
remote interface power supply; and a remote interface
circuit that receives power from the remote interface power
supply and is capable of providing power to at least a portion
of the first computer.

Yet another embodiment of the present invention is a
system for independent powering of a power-on process on
a computer, comprising a first computer having a first
computer power supply; and a remote interface circuit that
receives power from a power supply that is independent
from the first computer power supply and provides
independent power to portions of the first computer to facilitate
remotely powering up the first computer if the first computer
power is operating below a predetermined threshold.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a top level block diagram of microcontroller
network components utilized by an embodiment of the
present invention.

FIG. 2 is a block diagram of the server portion of the
microcontroller network shown in FIG. 1.

FIG. 3 is a block diagram of one embodiment of a remote
interface board (RIB) that is part of the microcontroller
network shown in FIGS. 1 and 2.

FIG. 4 is a diagram of one embodiment of serial protocol
message formats utilized by the RIB shown in FIG. 3.

FIG. 5 is a schematic diagram of a bias power portion of
the RIB shown in FIG. 3.

FIG. 6 is a schematic diagram of a bias power portion of
the server system board in the server system of FIG. 2.

DETAILED DESCRIPTION OF THE
INVENTION

The following detailed description presents a description
of certain specific embodiments of the present invention.
However, the present invention can be embodied in a
multitude of different ways as defined and covered by the
claims. In this description, reference is made to the drawings
wherein like parts are designated with like numerals
throughout.

For convenience, the description will be organized into
the following principal sections: Introduction, Server
System, Microcontroller Network, Remote Interface Board,
Remote Interface Serial Protocol, and Microcontroller Network
Bias Power.

I. INTRODUCTION

The inventive computer server system and client
computer includes a distributed hardware environment
management system that is built as a small self-contained network
of microcontrollers. Operating independently of the system
processor and operating software, embodiments of the
present invention use separate processors for providing
information and managing the hardware environment
including fans, power supplies and temperature.

Initialization, modification and retrieval of system condi-
tions are performed through utilization of a remote interface
by issuing commands to the environmental processors. The
system conditions may include system log size, presence of
faults in the system log, serial number for each of the
environmental processors, serial numbers for each power
supply of the system, system identification, system log
count, power settings and presence, canister presence,
temperature, BUS/CORE speed ratio, fan speeds, settings
for fan faults, LCD display, Non-Maskable Interrupt (NMI)
request bits, CPU fault summary, FRU status, JTAG enable
bit, system log information, remote access password, over-
temperature fault, CPU error bits, CPU presence, CPU
thermal fault bits, and remote port modem. The aforemen-
tioned list of capabilities provided by the present environ-
mental system is not all-inclusive.

The server system and client computer provides mecha- 
nisms for the evaluation of the data that the system collects 
and methods for the diagnosis and repair of server problems 
in a manner that system errors can be effectively and
efficiently managed. The time to evaluate and repair prob- 
lems is minimized. The server system ensures that the 
system will not go down, so long as sufficient system 
resources are available to continue operation, but rather 
degrade gracefully until the faulty components can be
replaced. 

II. SERVER SYSTEM 

Referring to FIG. 1, a server system 100 with a remote
client computer will be described. In one embodiment, the 
server system hardware environment 100 may be built 
around a self-contained network of microcontrollers, such 
as, for example, a remote interface microcontroller on the 
remote interface board or circuit 104, a system interface
microcontroller 106 and a system recorder microcontroller 
110. This distributed service processor network 102 may 
operate as a fully self-contained subsystem within the server 
system 100, continuously monitoring and managing the 
physical environment of the machine (e.g., temperature,
voltages, fan status). The microcontroller network 102 con- 
tinues to operate and provides a system administrator with 
critical system information, regardless of the operational 
status of the server 100. 

Information collected and analyzed by the microcontrol-
ler network 102 can be presented to a system administrator
using either SNMP-based system management software (not 
shown), or using microcontroller network Recovery Man- 
ager software 130 through a local connection 121 or a dial-in 
connection 123. The system management software, which
interfaces with the operating software (OS) 108 such as 
Microsoft Windows NT Version 4.0 or Novell Netware 
Version 4.11, for example, provides the ability to manage the 
specific characteristics of the server system, including Hot 
Plug Peripheral Component Interconnect (PCI), power and
cooling status, as well as the ability to handle alerts asso- 
ciated with these features. 

The microcontroller network Recovery Manager software 
130 allows the system administrator to query the status of 
the server system 100 through the microcontroller network
102, even when the server is down. Using the microcon- 
troller network remote management capability, a system 
administrator can use the Recovery Manager 130 to re-start 
a failed system through a modem connection 123. First, the 
administrator can remotely view the microcontroller net-
work Flight Recorder, a feature that stores all system
messages, status and error reports in a circular Non-Volatile 
Random Access Memory buffer (NVRAM) 112. Then, after
determining the cause of the system problem, the adminis- 
trator can use microcontroller network "fly by wire" capa- 
bility to reset the system, as well as to power the system off 
or on. "Fly by wire" denotes that no switch, indicator or 
other control is directly connected to the function it monitors 
or controls, but instead, all the control and monitoring 
connections are made by the microcontroller network 102. 
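
The circular buffer behavior attributed to the Flight Recorder can be illustrated with a short sketch. This is only an illustration under stated assumptions: the class name, the capacity and the event strings are hypothetical, and the source does not describe the actual NVRAM record layout.

```python
from collections import deque


class FlightRecorder:
    """Fixed-capacity circular log (hypothetical model of the NVRAM buffer)."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest entry once it is full,
        # which is the defining behavior of a circular buffer.
        self._entries = deque(maxlen=capacity)

    def record(self, message):
        self._entries.append(message)

    def dump(self):
        """Return the stored entries, oldest first."""
        return list(self._entries)


# Hypothetical events; the capacity of 3 is chosen only to show wrap-around.
recorder = FlightRecorder(capacity=3)
for event in ["power on", "fan 2 slow", "over-temperature", "NMI requested"]:
    recorder.record(event)
log = recorder.dump()
```

With a capacity of three, the fourth event overwrites the first, so a remote reader sees only the most recent history.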

The remote interface board (RIB) 104 interfaces the 
server system 100 to an external client computer. The RIB 
104 connects to either a local client computer 122 at the 
same location as the server 100 or to remote (or link) client 
computer 124 through an optional switch 120. The client 
computer 122/124 may in one embodiment run either 
Microsoft Windows 95 or Windows NT Workstation version 
4.0 operating software (OS) 132. The processor and RAM 
requirements of the client computer 122/124 are those
required by the OS 132. The serial port of the client
computer 122/124 may utilize a type 16550A Universal 
Asynchronous Receiver Transmitter (UART). The switch 
facilitates either the local connection 121 or the modem 
connection 123 at any one time, but allows both types of 
connections to be connected to the switch. In another
embodiment, either the local connection 121 or the modem 
connection 123 is connected directly to the RIB 104. The 
local connection 121 utilizes a readily available null-modem 
serial cable to connect to the local client computer. The 
modem connection may utilize a Hayes-compatible server 
modem 126 and a Hayes-compatible client modem 128. In 
one embodiment, a model V.34X 33.6K data/fax modem
available from Zoom is utilized as the client modem and the
server modem. In another embodiment, a Sportster 33.6K
data/fax modem available from US Robotics is utilized as 
the client modem. 

The steps of connecting the remote client computer 124 to 
the server 100 will now be briefly described. The remote 
interface 104 has a serial port connector 204 (FIG. 3) that 
directly connects with a counterpart serial port connector of 
the external server modem 126 without the use of a cable. If 
desired, a serial cable could be used to interconnect the 
remote interface 104 and the server modem 126. The cable 
end of an AC to DC power adapter (not shown, for example 
a 120 Volt AC to 7.5 Volt DC, or a 220V, European or 
Japanese adapter) is then connected to the DC power con- 
nector J2 (220, FIG. 3) of the remote interface, while the 
double-prong end is plugged into a 120 Volt AC wall outlet.
One end of an RJ-45 parallel-wire data cable 103 is then 
plugged into an RJ-45 jack (226, FIG. 3) on the remote 
interface 104, while the other end is plugged into an RJ-45
Recovery Manager jack on the server 100. The RJ-45 jack 
on the server then connects to the microcontroller network 
102. The server modem 126 is then connected to a commu- 
nications network 127 using an appropriate connector. The 
communications network 127 may be a public switched 
telephone network, although other modem types and com- 
munication networks are envisioned. For example, if cable 
modems are used for the server modem 126 and client 
modem 128, the communications network can be a cable 
television network. As another example, satellite modulator/ 
demodulators can be used in conjunction with a satellite 
network. 

At the remote client computer 124, a serial cable (25-pin 
D-shell) 129 is used to interconnect the client modem 128 
and the client computer 124. The client modem 128 is then 
connected to the communications network 127 using an 
appropriate connector. Each modem is then plugged into an 
appropriate power source for the modem, such as an AC 
outlet. At this time, the Recovery Manager software 130 is
loaded into the client computer 124, if not already present, 
and activated. 

The steps of connecting the local client computer 122 to 
the server 100 are similar, but modems are not necessary. 
The main difference is that the serial port connector of the 
remote interface 104 connects to a serial port of the local 
client computer 122 by the null-modem serial cable 121. 

III. MICROCONTROLLER NETWORK 

In one embodiment, the invention is implemented by a 
network of microcontrollers 102 (FIG. 1). The microcon-
trollers may provide functionality for system control, diag-
nostic routines, self-maintenance control, and event logging
processors. A further description of the microcontrollers and
microcontroller network is provided in U.S. patent applica- 
tion Ser. No. 08/942,402, entitled "Diagnostic and Manag- 
ing Distributed Processor System", and in U.S. patent appli- 
cation Ser. No. 08/942,160, entitled "System Architecture 
For Remote Access and Control of Environmental Manage-
ment".

Referring to FIG. 2, in one embodiment of the invention, 
the network of microcontrollers 102 includes ten processors. 
One of the purposes of the microcontroller network 102 is to 
transfer messages to the other components of the server
system 100. The processors may include: a System Interface 
controller 106, a CPU A controller 166, a CPU B controller 
168, a System Recorder 110, a Chassis controller 170, a 
Canister A controller 172, a Canister B controller 174, a 
Canister C controller 176, a Canister D controller 178 and a 
Remote Interface controller 200. The Remote Interface 
controller 200 is located on the RIB 104 (FIG. 1) which is 
part of the server system 100, but may preferably be external 
to a server enclosure. The System Interface controller 106, 
the CPU A controller 166 and the CPU B controller 168 are 
located on a system board 150 in the server 100. Also located 
on the system board are one or more central processing units 
(CPUs) or microprocessors 164 and an Industry Standard 
Architecture (ISA) bus 162 that connects to the System 
Interface Controller 106. Of course, other buses such as PCI,
EISA and microchannel may be used. The CPU 164 may be 
any conventional general purpose single-chip or multi-chip 
microprocessor such as a Pentium®, Pentium® Pro or 
Pentium® II processor available from Intel Corporation, a 
SPARC processor available from Sun Microsystems, a 
MIPS® processor available from Silicon Graphics, Inc., a 
Power PC® processor available from Motorola, or an 
ALPHA® processor available from Digital Equipment Cor- 
poration. In addition, the CPU 164 may be any conventional 
special purpose microprocessor such as a digital signal 
processor or a graphics processor. 

The System Recorder 110 and Chassis controller 170, 
along with the NVRAM 112 that connects to the System 
Recorder 110, may be located on a backplane 152 of the 
server 100. The System Recorder 110 and Chassis controller
170 are typically the first microcontrollers to power up when 
server power is applied. The System Recorder 110, the 
Chassis controller 170 and the Remote Interface microcon- 
troller 200 are the three microcontrollers that have a bias 5 
volt power supplied to them. If the main server power is off,
an independent power supply source for the bias 5 volt 
power is provided by the RIB 104 (FIG. 1). The Canister 
controllers 172-178 are not considered to be part of the 
backplane 152 because they are located on separate cards 
and are removable.

Each of the microcontrollers has a unique system identi- 
fier or address. The addresses are as follows in Table 1: 



TABLE 1

Microcontroller                      Address

System Interface controller 106      10
CPU A controller 166                 03
CPU B controller 168                 04
System Recorder 110                  01
Chassis controller 170               02
Canister A controller 172            20
Canister B controller 174            21
Canister C controller 176            22
Canister D controller 178            23
Remote Interface controller 200      11
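
As a minimal sketch, the address assignments of Table 1 can be held in a lookup table so that a message can be attributed to its microcontroller. The addresses are kept as two-digit strings because the source does not state whether they are decimal or hexadecimal; the function name `controller_for` is hypothetical.

```python
# Addresses copied from Table 1. They are stored as strings because the
# source does not say which radix the two-digit values use.
MICROCONTROLLER_ADDRESSES = {
    "System Interface controller 106": "10",
    "CPU A controller 166": "03",
    "CPU B controller 168": "04",
    "System Recorder 110": "01",
    "Chassis controller 170": "02",
    "Canister A controller 172": "20",
    "Canister B controller 174": "21",
    "Canister C controller 176": "22",
    "Canister D controller 178": "23",
    "Remote Interface controller 200": "11",
}


def controller_for(address):
    """Return the microcontroller name for an address, or None if unknown."""
    for name, assigned in MICROCONTROLLER_ADDRESSES.items():
        if assigned == address:
            return name
    return None


# Every address must be unique for the reverse lookup to be unambiguous.
addresses_are_unique = (
    len(set(MICROCONTROLLER_ADDRESSES.values())) == len(MICROCONTROLLER_ADDRESSES)
)
```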



The microcontrollers may be Microchip Technologies, 
Inc. PIC processors in one embodiment, although other 
microcontrollers such as an 8051 available from Intel, an 
8751 available from Atmel, and a P80CL580 microproces- 
sor available from Philips, could be utilized. The PIC16C74 
(Chassis controller 170) and PIC16C65 (the other 
controllers) are members of the PIC16CXX family of 
CMOS, fully-static, EPROM-based 8-bit microcontrollers. 
The PIC controllers have 192 bytes of RAM, in addition to 
program memory, three timer/counters, two capture/ 
compare/Pulse Width Modulation modules and two serial 
ports. The synchronous serial port is configured as a two-
wire Inter-Integrated Circuit (I²C) bus in one embodiment of
the invention. The PIC controllers use a Harvard architecture 
in which program and data are accessed from separate 
memories. This improves bandwidth over traditional von 
Neumann architecture processors where program and data 
are fetched from the same memory. Separating program and 
data memory further allows instructions to be sized differ-
ently than the 8-bit wide data word. Instruction opcodes are
14 bits wide, making it possible to have all single word
instructions. A 14-bit wide program memory access bus 
fetches a 14-bit instruction in a single cycle. 

In one embodiment of the invention, the microcontrollers 
communicate through an I²C serial bus, also referred to as
a microcontroller bus 160. The document "The I²C Bus and
How to Use It" (Philips Semiconductor, 1992) is hereby
incorporated by reference. The I²C bus is a bidirectional
two-wire bus and may operate at a 400 kbps rate. However, 
other bus structures and protocols could be employed in 
connection with this invention. For example, Apple Com- 
puter ADB, Universal Serial Bus, IEEE-1394 (Firewire), 
IEEE-488 (GPIB), RS-485, or Controller Area Network 
(CAN) could be utilized as the microcontroller bus. Control 
on the microcontroller bus is distributed. Each microcon- 
troller can be a sender (a master) or a receiver (a slave) and 
each is interconnected by this bus. A microcontroller directly
controls its own resources, and indirectly controls resources 
of other microcontrollers on the bus. 

Here are some of the features of the I²C-bus:

Two bus lines are utilized: a serial data line (SDA) and a
   serial clock line (SCL).
Each device connected to the bus is software addressable
   by a unique address and simple master/slave relation-
   ships exist at all times; masters can operate as master-
   transmitters or as master-receivers.
The bus is a true multi-master bus including collision
   detection and arbitration to prevent data corruption if
   two or more masters simultaneously initiate data trans-
   fer.
Serial, 8-bit oriented, bidirectional data transfers can be
   made at up to 400 kbit/second in the fast mode.
Two wires, serial data (SDA) and serial clock (SCL),
carry information between the devices connected to the I²C
bus. Each device is recognized by a unique address and can 
operate as either a transmitter or receiver, depending on the 
function of the device. For example, a memory device 
connected to the I²C bus could both receive and transmit
data. In addition to transmitters and receivers, devices can 
also be considered as masters or slaves when performing 
data transfers (see Table 2). A master is the device which 
initiates a data transfer on the bus and generates the clock 
signals to permit that transfer. At that time, any device
addressed is considered a slave. 

TABLE 2

Definition of I²C-bus terminology

Term              Description

Transmitter       The device which sends the data to the bus
Receiver          The device which receives the data from the bus
Master            The device which initiates a transfer, generates clock
                  signals and terminates a transfer
Slave             The device addressed by a master
Multi-master      More than one master can attempt to control the bus at
                  the same time without corrupting the message
Arbitration       Procedure to ensure that, if more than one master
                  simultaneously tries to control the bus, only one is
                  allowed to do so and the message is not corrupted
Synchronization   Procedure to synchronize the clock signal of two or
                  more devices



The I²C-bus is a multi-master bus. This means that more
than one device capable of controlling the bus can be
connected to it. As masters are usually microcontrollers,
consider the case of a data transfer between two microcon-
trollers connected to the I²C-bus. This highlights the master-
slave and receiver-transmitter relationships to be found on
the I²C-bus. It should be noted that these relationships are
not permanent, but depend on the direction of data transfer
at that time. The transfer of data would proceed as follows:

1) Suppose microcontroller A wants to send information
to microcontroller B:

   microcontroller A (master) addresses microcontroller
      B (slave);
   microcontroller A (master-transmitter) sends data to
      microcontroller B (slave-receiver);
   microcontroller A terminates the transfer.

2) If microcontroller A wants to receive information from
microcontroller B:

   microcontroller A (master) addresses microcontroller B
      (slave);
   microcontroller A (master-receiver) receives data from
      microcontroller B (slave-transmitter);
   microcontroller A terminates the transfer.

Even in this situation, the master (microcontroller A) 
generates the timing and terminates the transfer. 

The possibility of connecting more than one microcon-
troller to the I²C-bus means that more than one master could
try to initiate a data transfer at the same time. To avoid the 
chaos that might ensue from such an event, an arbitration 
procedure has been developed. This procedure relies on the 
wired-AND connection of all I²C interfaces to the I²C-bus.

If two or more masters try to put information onto the bus, 
the first to produce a 'one' when the other produces a 'zero'
will lose the arbitration. The clock signals during arbitration 
are a synchronized combination of the clocks generated by 
the masters using the wired-AND connection to the SCL 
line. 
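
The arbitration rule described above can be sketched as a small simulation. This is an illustrative model only, not the bus hardware: the function name `arbitrate`, the master names and the bit patterns are hypothetical, and clock synchronization on the SCL line is not modeled.

```python
def arbitrate(masters):
    """Simulate wired-AND arbitration on the data line.

    `masters` maps a master name to the list of bits it tries to send
    (all lists the same length, first bit sent first). Returns the names
    still in contention at the end and the bits seen on the bus.
    """
    contenders = set(masters)
    n_bits = len(next(iter(masters.values())))
    bus = []
    for i in range(n_bits):
        # Wired-AND: the line reads 1 only if every active master sends 1.
        level = 1
        for name in contenders:
            level &= masters[name][i]
        bus.append(level)
        # A master that sent 1 but reads back 0 has lost and withdraws.
        contenders = {name for name in contenders if masters[name][i] == level}
    return sorted(contenders), bus


# Hypothetical bit patterns: A and B agree until the sixth bit,
# where B sends a one while A sends a zero, so B loses.
winners, bus_bits = arbitrate({
    "A": [0, 1, 0, 0, 0, 0, 1],
    "B": [0, 1, 0, 0, 0, 1, 0],
})
```

Because the losing master only ever tried to send a one where the bus carried a zero, the bits on the bus are exactly those of the winner, and the winner's message is not corrupted.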

Generation of clock signals on the I²C-bus is the respon-
sibility of master devices. Each master microcontroller gen-
erates its own clock signals when transferring data on the
bus. 



The command, diagnostic, monitoring and history func- 
tions of the microcontroller network 102 are accessed using 
a global network memory model in one embodiment. That 
is, any function may be queried simply by generating a
network "read" request targeted at the function's known
global network address. In the same fashion, a function may
be exercised simply by "writing" to its global network
address. Any microcontroller may initiate read/write activity
by sending a message on the I²C bus to the microcontroller
responsible for the function (which can be determined from
the known global address of the function). The network 
memory model includes typing information as part of the 
memory addressing information. 
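
A minimal sketch of the global network memory model follows. It assumes, for illustration only, that a global address can be split into the bus address of the responsible microcontroller and a sub-address for the function; the source does not give the actual encoding, and the class names, the `FAN_SPEED` address and the stored value are hypothetical.

```python
class Controller:
    """One microcontroller that owns a set of functions keyed by sub-address."""

    def __init__(self, bus_address):
        self.bus_address = bus_address
        self._functions = {}

    def handle_write(self, sub_address, value):
        self._functions[sub_address] = value

    def handle_read(self, sub_address):
        return self._functions[sub_address]


class NetworkMemory:
    """Routes reads and writes to whichever controller owns the address."""

    def __init__(self, controllers):
        self._owners = {c.bus_address: c for c in controllers}

    def write(self, global_address, value):
        # The responsible controller is determined from the global address.
        bus_address, sub_address = global_address
        self._owners[bus_address].handle_write(sub_address, value)

    def read(self, global_address):
        bus_address, sub_address = global_address
        return self._owners[bus_address].handle_read(sub_address)


CHASSIS = "02"               # Chassis controller address from Table 1
FAN_SPEED = (CHASSIS, 0x10)  # hypothetical global address of a function

network = NetworkMemory([Controller(CHASSIS), Controller("01")])
network.write(FAN_SPEED, 3)  # exercise the function by "writing"
fan_speed = network.read(FAN_SPEED)  # query the function by "reading"
```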

Using a network global memory model in one embodi-
ment places relatively modest requirements on the
message protocol.

All messages conform to the I²C message format, includ-
   ing addressing and read/write indication.
All I²C messages use seven bit addressing.
Any controller can originate (be a Master) or respond (be
   a Slave).
All message transactions consist of I²C "Combined for-
   mat" messages. This is made up of two back-to-back
   I²C simple messages with a repeated START condition
   between (which does not allow for re-arbitrating the
   bus). The first message is a Write (Master to Slave) and
   the second message is a Read (Slave to Master).
Two types of transactions are used: Memory-Read and
   Memory-Write.
Sub-Addressing formats vary depending on data type
   being used.
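
The combined-format transaction listed above can be sketched as a symbolic sequence of bus events. In I²C seven bit addressing, the address occupies the upper seven bits of the first byte and the lowest bit is the read/write flag (0 for write, 1 for read). The slave address 0x21, the single request byte and the function names are hypothetical; the source does not give the sub-addressing layout.

```python
def address_byte(addr7, read):
    """Build the first byte of an I2C message: 7-bit address plus R/W bit."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("address must fit in seven bits")
    return (addr7 << 1) | (1 if read else 0)


def combined_transaction(addr7, request_bytes, response_length):
    """Return the event sequence of a write followed by a read.

    The two simple messages are joined by a repeated START, so the bus
    is not released (and not re-arbitrated) between them.
    """
    sequence = ["START", address_byte(addr7, read=False)]
    sequence.extend(request_bytes)           # master-to-slave payload
    sequence.append("REPEATED_START")
    sequence.append(address_byte(addr7, read=True))
    # Placeholders for the bytes the slave clocks back to the master.
    sequence.extend(["READ"] * response_length)
    sequence.append("STOP")
    return sequence


# Hypothetical slave address and request byte, reading one byte back.
events = combined_transaction(0x21, [0x10], response_length=1)
```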

IV. REMOTE INTERFACE BOARD 

Referring to FIG. 3, the remote interface board (RIB) 104,
previously shown in FIG. 1, will now be described. The RIB
is an interface between the microcontroller network 102
(FIG. 1) of the server system 100 and an external client
computer 122/124. The server system status and commands
are passed through the RS232 connector port 204 at the
client side of the RIB to the microcontroller network 102 on
the server 100, controlled through the on-board PIC16C65
microcontroller 200. Signals in the microcontroller network
102 are transported by the microcontroller bus 160 (FIG. 2).
In one embodiment, the microcontroller bus 160 utilizes the
I²C bus protocol, previously described. The signals on the
microcontroller bus 160 are received from the server 100 by
the RIB 104 on the RJ-45 cable 103 and are translated by the
PIC16C65 microcontroller 200 into an eight signal RS232
protocol. These RS232 signals are passed through a RS232
line transceiver 202, such as a LT1133A chip available from
Linear Technology, with a baud rate capable of reaching the
speed of 120 kbaud. A 25 pin D-Sub connector 204 connects
to the other side of the line transceiver 202 and provides the
point at which either the local client computer 122 or the
server modem 126 makes a connection.

The two wire microcontroller bus 160 is brought in from
the server 100 and passed to the microcontroller 200 using
the RJ-45 cable 103 and RJ-45 connector 226. A switch 228,
such as a QS3126 switch available from Quick Logic,
connects to the RJ-45 connector 226 and provides isolation
for the data and clock bus signals internal and external to the
RIB 104. If the RIB 104 and switch 228 have power, the
switch 228 feeds the bus signals through to a microcontroller
bus extender 230. Otherwise, if the switch 228 does not have
power, the microcontroller bus 160 is isolated from the RIB
104. The bus extender 230 connects between the switch 228
and the microcontroller 200. The bus extender 230 is a buffer
providing drive capability for the clock and data signals. In
one embodiment, the bus extender 230 is a 82B715 chip
available from Philips Semiconductor. Microcontroller 200
Port C, bit 3 is the clocking bit and Port C, bit 4 is the data
line.

Communication with the server modem 126 is based on
the RS232 protocol. The microcontroller 200 generates the
receive and the transmit signals, where the signal levels are
transposed to the RS232 levels by the LT1133A line
transceiver 202. There are three transmit signals, RTS, SOUT and
DTR, which are from Port A, bits 2, 3 and 4 of the
microcontroller 200, whereas the five receive signals are
from two ports, DCD, DSR from Port C, bits 1 and 0 and
SIN, CTS and RI from Port A, bits 5, 0 and 1.

In one embodiment the 25 pin RS232 pin connector 204
is used instead of a 9 pin connector, since this type of connector
is more common than the other. All the extra pins are not
connected except the pins 1 and 7, where pin 1 is chassis
ground and pin 7 is a signal ground.

A static random access memory (SRAM) 208 connects to
the microcontroller 200. In one embodiment, the SRAM 208
is a 32K×8 MT5LC2568 that is available from Micron
Technology. The SRAM 208 is also available from other
memory manufacturers. An external address register 206,
such as an ABT374, available from Texas Instruments, is
used for latching the higher addressing bits (A8-A14) of the
address for the SRAM 208 so as to expand the address to
fifteen bits. The SRAM 208 is used to store system status
data, system log data from the NVRAM 112 (FIG. 1), and
other message data for transfer to the external interface port
204 or to a microcontroller on the microcontroller bus 160
(FIG. 2).

Port D of the microcontroller 200 is the address port. Port
B is the data bus for the bi-directional data interconnect. Port
E is for the SRAM enable, output tristate and write control
signals. The microcontroller 200 operates at a frequency of
12 MHz.
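
To make the fifteen-bit addressing concrete: a 32K×8 SRAM has 2^15 = 32,768 byte locations, so its address spans bits A0-A14. The sketch below splits such an address into the low byte (A0-A7) and the seven high bits (A8-A14) that the external register holds. How the firmware actually sequences the latch and the ports is not described in the text, and the function name is hypothetical.

```python
SRAM_SIZE = 32 * 1024  # 32K x 8 device: 2**15 byte locations


def split_sram_address(addr: int) -> tuple:
    """Return (high_bits, low_byte) for a fifteen-bit SRAM address."""
    if not 0 <= addr < SRAM_SIZE:
        raise ValueError("address does not fit in fifteen bits")
    low_byte = addr & 0xFF          # A0-A7
    high_bits = (addr >> 8) & 0x7F  # A8-A14, held by the external register
    return high_bits, low_byte


high, low = split_sram_address(0x7ABC)
```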

An Erasable Programmable Read Only Memory (EPROM)
212 is used for storing board serial number identification
information for the RIB 104. The serial number memory 212
is signal powered, retaining the charge into a capacitor
sourced through the data line. In one embodiment, the serial
number memory 212 stores eight sixteen-byte serial/revision
numbers (for maintaining the rework/revision history) and is
a DS2502 chip available from Dallas Semiconductor. The
programming of memory 212 is handled using a jumper
applied through an external connector J1 210. The serial
number memory 212 connects to the microcontroller 200 at
Port C, bit 6 and to the external connector J1 210.

The RIB 104 may be powered through a 7.5 Volt/800 mA
supply unit that plugs into a connector J2 220. In one
embodiment, the supply unit is a 120 Volt AC to DC wall
adapter. Connector J2 220 feeds a LT1376 high frequency
switching regulator 222, available from Linear Technology,
which regulates the power source. The regulated power
output is used locally by the components on the RIB 104,
and 300 mA are sourced to the microcontroller network 102
through a 300 mA fuse 224 and the RJ-45 connector 226.
Thus, the output of the regulator 222 provides an alternative
source for a bias-powered partition of the microcontroller
network 102. The bias-powered partition includes the
system recorder 110 (FIG. 1), the NVRAM 112 and the Chassis
controller 170 (FIG. 2) which are resident on the server
backplane 152.

V. REMOTE INTERFACE SERIAL PROTOCOL 

The microcontroller network remote interface serial
protocol communicates microcontroller network messages
across a point-to-point serial link. This link is between the
RIB controller 200 and the Recovery Manager 130 at the
remote client 122/124. This protocol encapsulates
microcontroller network messages in a transmission packet
to provide error-free communication and link security.

In one embodiment, the remote interface serial protocol 
uses the concept of byte stuffing. This means that certain 
byte values in the data stream have a particular meaning. If 
that byte value is transmitted by the underlying application 
as data, it must be transmitted as a two-byte sequence. 

The bytes that have a special meaning in this protocol are: 



SOM 306     Start of a message
EOM 316     End of a message
SUB         The next byte in the data stream must be substituted
            before processing.
INT 320     Event Interrupt
Data 312    An entire microcontroller network message



As stated above, if any of these byte values occur as data
in a message, a two-byte sequence must be substituted for
that byte. The sequence is a byte with the value of SUB,
followed by a byte with the value of the original byte
incremented by one. For example, if a SUB byte occurs in
a message, it is transmitted as a SUB followed by a byte that
has a value of SUB+1.
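
The substitution rule can be sketched as a pair of functions. The numeric values assigned to SOM, EOM, SUB and INT below are hypothetical, since the text does not give them; they are chosen so that no incremented value collides with another special byte, which is itself an assumption about the real protocol.

```python
# Hypothetical byte values; the text does not give the real ones.
SOM, EOM, SUB, INT = 0x7B, 0x7D, 0x5C, 0x5E
SPECIAL = {SOM, EOM, SUB, INT}


def stuff(payload: bytes) -> bytes:
    """Replace each special byte with SUB followed by (byte + 1)."""
    out = bytearray()
    for b in payload:
        if b in SPECIAL:
            out += bytes([SUB, b + 1])
        else:
            out.append(b)
    return bytes(out)


def unstuff(stream: bytes) -> bytes:
    """Undo stuff(): after a SUB, subtract one from the next byte."""
    out = bytearray()
    it = iter(stream)
    for b in it:
        if b == SUB:
            try:
                out.append(next(it) - 1)
            except StopIteration:
                raise ValueError("stream ends after SUB") from None
        else:
            out.append(b)
    return bytes(out)


raw = bytes([0x01, SUB, 0x02, EOM])
wire = stuff(raw)
```

With these example values, the stuffed stream contains no bare SOM, EOM or INT byte, so a receiver can treat any such byte it sees as a framing or interrupt marker rather than data.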

Referring to FIG. 4, the two types of messages 300 used 
by the remote interface serial protocol will be described. 

1. Requests 302, which are sent by remote management 
(client) computers 122/124 (FIG. 1) to the remote 
interface 104. 

2. Responses 304, which are returned to the requester
122/124 by the remote interface 104.

The fields of the messages are defined as follows: 



SOM 306       A special data byte value marking the start of a message.

EOM 316       A special data byte value marking the end of a message.

Seq. # 308    A one-byte sequence number, which is incremented on
              each request. It is stored in the response.

TYPE 310      One of the following types of requests:

  IDENTIFY    Requests the remote interface to send back identification
              information about the system to which it is connected.
              It also resets the next expected sequence number.
              Security authorization does not need to be established
              before the request is issued.

  SECURE      Establishes secure authorization on the serial link by
              checking password security data provided in the message
              with the microcontroller network password.

  UNSECURE    Clears security authorization on the link and attempts to
              disconnect it. This requires security authorization to
              have been previously established.

  MESSAGE     Passes the data portions of the message to the
              microcontroller network for execution. The response
              from the microcontroller network is sent back in the data
              portion of the response. This requires security
              authorization to have been previously established.

  POLL        Queries the status of the remote interface. This request
              is generally used to determine if an event is pending in
              the remote interface.

STATUS 318    One of the following response status values:

  OK          Everything relating to communication with the remote
              interface is successful.

  OK_EVENT    Everything relating to communication with the remote
              interface is successful. In addition, there is one or more
              events pending in the remote interface.

  SEQUENCE    The sequence number of the request is neither the
              current sequence number (a retransmission request) nor
              the next expected sequence number (a new request).
              Sequence numbers may be reset by an IDENTIFY
              request.

  CHECK       The check byte in the request message is received
              incorrectly.

  FORMAT      Something about the format of the message is incorrect.
              Most likely, the type field contains an invalid value.

  SECURE      The message requires that security authorization be in
              effect, or, if the message has a TYPE value of SECURE,
              the security check failed.

Check 314     Indicates a message integrity check byte. Currently the
              value is 256 minus the sum of previous bytes in the
              message. For example, adding all bytes in the message
              up to and including the check byte should produce a
              result of zero (0).

INT 320       A special one-byte message sent by the remote interface
              when it detects the transition from no events pending to
              one or more events pending. This message can be used
              to trigger reading events from the remote interface.
              Events should be read until the return status changes
              from OK_EVENT to OK.
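
The check byte rule in the table above can be sketched as follows. This assumes that sums are taken modulo 256, so that a one-byte result always exists, and that "previous bytes" means every byte of the message that precedes the check byte; the text does not say whether the sum is computed before or after byte stuffing. The function names and the example bytes are hypothetical.

```python
def check_byte(previous: bytes) -> int:
    """Return 256 minus the byte sum, reduced to a single byte."""
    return (256 - sum(previous)) % 256


def is_intact(message_through_check: bytes) -> bool:
    """True when all bytes up to and including the check byte sum to zero."""
    return sum(message_through_check) % 256 == 0


body = bytes([0x7B, 0x01, 0x04, 0x10, 0x20])  # example bytes, not a real message
check = check_byte(body)
```

A receiver following this sketch would answer with the CHECK status whenever `is_intact` returns False for a received request.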



VI. MICROCONTROLLER NETWORK BIAS 
POWER 

There are two separate 5 Volt power sources associated
with the server system 100: a 5 Volt bias power that is
supplied to the Chassis controller 170 (FIG. 2) and the 
System Recorder 110 by a server power supply whenever 
AC power is enabled, and a 5 Volt (5V) general or main 
power that is also provided by the server power supply. Bias 
power is considered to be low current (generally less than 
one Amp, e.g., 300 mA) but has less delay than general 
power when the supply is initially turned on. General 5V 
power is controlled through the Chassis controller 170. 
When the server system 100 is down, i.e., the general 5V 
power is off, the microcontroller network 102 (FIG. 1) is still 
electronically responsive via the remote interface board 104 
to the Chassis controller 170 and System Recorder 110. 
Commands can be issued from a software application run- 
ning on the local client computer 122 or remote client 
computer 124 to turn on the general 5V power, read the 
system log, check system type, and so forth. 

When the general 5V power is off at the server 100, the
5V bias power supplied by the server power supply will also
be off. However, as long as the independent power supply
360 located at the remote interface 104 is operational, the
remote interface board provides the 5V bias power and sends
it via the RJ-45 cable 103 (FIG. 1) to the Chassis controller
170 and the System Recorder 110 on the microcontroller
network 102. This power supply 360 could be a battery, or
an AC/DC adapter or any other source of electrical power.

Referring to FIG. 5, the bias power portion of the remote
interface board 104 will be described. As previously
described, the independent RIB power supply 360, such as
a 120 Volt AC/7.5 Volt DC power adapter, is connected to
the DC power connector J2 220. Pin 1 of the connector J2
connects via line 370 to provide the DC voltage to a VIN pin
of the LT1376 high frequency step-down switching
regulator 222. Pin 2 of the connector J2 connects to ground via line
372. The regulator 222, along with the external components
suggested in the data sheet for the Linear Technology
LT1376 component, provides a positive 5V output on a
VCC5 line 374. The VCC5 line 374 connects to the other
components on the RIB 104 to provide power to each RIB
component. The VCC5 line 374 also connects to a fuse 224.
In one embodiment the fuse 224 may be rated at 300
milliAmperes. The fuse 224 further connects via XVCC5
line 376 to pin 5 of RJ-45 connector 226, thereby providing
300 mA, positive 5V bias power to be fed to the server
microcontroller network 102 (FIG. 1). The extender
microcontroller bus clock (XLCL) and data (XLDA) signals
to/from the switch 228 (FIG. 3) also connect to the RJ-45
connector 226 at pins 2 and 4, respectively. These signals
correspond to I²C clock and data signals.

Referring to FIG. 6, the bias power portion of the server 
100 will now be described. The bias power portion of the 
server is generally located at the backplane 152 (FIG. 2). 



The RJ-45 cable 103 (FIG. 1) interconnects the RJ-45
connector 226 (FIG. 5) and a RJ-45 connector 406 at the
server 100. Pin 5 of the RJ-45 connector 406 provides the
300 mA, positive 5V bias power from the RIB 104 on a line
408. The line 408 feeds the anode side of a diode 404. In one
embodiment, the diode is a type MBRS320. The cathode
side of diode 404 is the BIAS_5V power 400 for selected
components of the server 100. The selected components
include the Chassis controller 170, the System Recorder 110
and the NVRAM 112 as shown in FIG. 6. Other components
closely affiliated with the Chassis controller 170 and System
Recorder 110 are also powered by the bias power. Of course,
in other embodiments, other components of the diagnostic
network 102 or of the server 100 could be fed power by the
RIB 104 via its independent power supply 360.

The diode 404 prevents the 5V bias power from the server
power supply 410 from being supplied to the RIB 104 via
the RJ-45 cable 103. However, when the server power
supply is off, the bias power from the RIB 104 flows on line
408 through diode 404 to supply the bias power driven
components of the server 100. In addition, when the server
5V bias power is below a nominal voltage, the RIB supplied
bias power engages to bring the BIAS_5V voltage up to 5V.

The extender microcontroller bus clock (XLCL) and data
(XLDA) signals link to a microcontroller bus extender
circuit 402. The bus extender 402 is a buffer providing drive
capability for the clock and data signals. In one embodiment,
the bus extender 402 is a 82B715 chip available from Philips
Semiconductor. The outputs of the bus extender 402 are the
serial clock (SCL) and serial data (SDA) signals of the
microcontroller bus 160. These two signals on the
microcontroller bus 160 connect to the Chassis controller 170 and
System Recorder 110, as previously described.

An example of using the independent powering aspect of
the server system 100 will now be described. In the event of
a server failure where the server power supply 410 is off, the
server 5V bias power is not available for the server
components. When this situation occurs, the RIB 104 supplies
the bias power to the bias powered components on the
server. The loss of power by the server power supply 410 is
reported as an event by the Chassis controller 170 (which is
powered by the RIB supplied bias power) to the RIB
microcontroller 200 (FIG. 3) via the microcontroller
network 102. This event is sent to the Recovery Manager 130
(FIG. 1) so as to be displayed to a user of the client computer
122/124. The user may then elect to view the system log in
the NVRAM 112 by use of the Recovery Manager 130 at the
client computer 122/124 to determine the cause of the
problem. After diagnosing the server problem, the user may
then decide to power up the server by issuing a power up
command through the Recovery Manager 130 to the Chassis
controller 170. The Chassis controller 170 then powers up
the server power supply 410 to restore general power to the
server system.

While the above detailed description has shown,
described, and pointed out the fundamental novel features of
the invention as applied to various embodiments, it will be
understood that various omissions and substitutions and
changes in the form and details of the system illustrated may
be made by those skilled in the art, without departing from
the intent of the invention.

What is claimed is:

1. A system for independent powering of a first computer,
comprising:

a first computer storing status information;

a first computer power supply that supplies power to the
first computer;

a remote interface power supply that is independent from
the first computer power supply; and

a remote interface circuit that receives power from the
remote interface power supply so as to power at least a
computational device in the remote interface circuit and
is configured to provide independent power to only a
portion of the first computer to facilitate reading of the
status information.

2. The system defined in claim 1, additionally comprising
a second computer in data communication with the first
computer through the remote interface circuit.

3. The system defined in claim 2, wherein the second
computer displays the status information.

4. The system defined in claim 1, wherein the independent
power provided by the remote interface circuit is isolated
from the first computer power supply by a diode.

5. The system defined in claim 1, wherein the independent
power supply includes an AC to DC adapter.

6. The system defined in claim 1, wherein the status
information is stored in a system log.

7. The system defined in claim 6, wherein the system log
is stored in a non-volatile random access memory
(NVRAM).

8. The system defined in claim 7, wherein the NVRAM is
accessed by a microcontroller.

9. The system defined in claim 1, wherein independent
power is provided to the first computer when the first
computer power supply is off.

10. The system defined in claim 1, wherein independent
power is provided to the first computer when the first
computer power supply is inoperable or operating below a
threshold power level.

11. The system defined in claim 1, wherein the portion of
the first computer includes a microcontroller.

12. A system for independent powering of a first
computer, comprising:

a first computer storing status information;

a first computer power supply that supplies power to the
computer;

a remote interface power supply, wherein the remote
interface power supply provides power that is independent
of the power supplied by the first computer power
supply; and

a remote interface circuit that receives the power from the
remote interface power supply so as to power at least a
computational device in the remote interface circuit and
is configured to provide the independent power to only
a portion of the first computer.

13. The system defined in claim 12, wherein the independent
power facilitates reading of the status information when
the first computer power supply is inoperable or operating
below a threshold power level.

14. The system defined in claim 12, wherein the portion
of the first computer includes a microcontroller.

15. A system for independent powering of a computer,
comprising:

a first computer having a first computer power supply; and

a remote interface circuit that receives power from a
power supply that is independent from the first computer
power supply so as to power at least a computational
device in the remote interface circuit and provides
independent power to only a portion of the first
computer if the first computer power supply is inoperable
or operating below a threshold power level.

16. The system defined in claim 15, wherein the portion
of the first computer includes a microcontroller.
17. The system defined in claim 16, wherein the remote
interface circuit includes a remote interface microcontroller
connected by a bus to the microcontroller in the first
computer.

18. The system defined in claim 15, additionally comprising
a second computer in data communication with the
first computer through the remote interface circuit.

19. The system defined in claim 18, wherein the second
computer remotely turns on the first computer power supply
through the remote interface circuit, wherein a command is
received by the remote interface circuit and the
computational device executes the command.

20. The system defined in claim 15, wherein the power
provided by the remote interface circuit is isolated from the
first computer power supply by a diode.

21. A system for independent powering of a first
computer, comprising:

a first computer storing status information and having a
system processor;

a first computer power supply that supplies power to the
first computer;

a remote interface power supply that is independent from
the first computer power supply; and



a remote interface circuit that receives power from the 
remote interface power supply and is configured to 
provide independent power to at least a portion of the 
first computer to facilitate reading of the status infor- 
mation without the system processor being powered. 

22. The system defined in claim 21, additionally com- 
prising a second computer connected to the first computer 
through the remote interface circuit. 

23. The system defined in claim 22, wherein the second 
computer displays the status information. 

24. A system for independent powering of a computer, 
comprising: 

a first computer having a first computer power supply and 
having a system processor; and 

a remote interface circuit that receives power from a 
power supply that is independent from the first com- 
puter power supply and provides independent power to 
at least a portion of the first computer, without power- 
ing the system processor, if the first computer power 
supply is inoperable or operating below a threshold 
power level. 